Pre-stack Kirchhoff Time Migration on Hadoop and Spark
Authors
Abstract
Pre-stack Kirchhoff time migration (PKTM) is one of the most widely used migration algorithms in seismic imaging. However, PKTM is time-consuming because of its high computational cost, which greatly reduces working efficiency in the oil industry. Owing to its high fault tolerance and scalability, Hadoop has become the most popular platform for big data processing. Spark, a newer distributed framework, emerged to overcome Hadoop's excessive network traffic and disk I/O. However, the behaviour and performance of these two systems when applied to high performance computing are still under investigation. In this paper, we propose two parallel algorithms for pre-stack Kirchhoff time migration, based on Hadoop and Spark respectively. Experiments are carried out to compare their performance. The results show that both implementations are efficient and scalable, and that our PKTM on Spark performs better than the one on Hadoop.
Similar resources
Experiences Running and Optimizing the Berkeley Data Analytics Stack on Cray Platforms
The Berkeley Data Analytics Stack (BDAS) is an emerging framework for big data analytics. It consists of the Spark analytics framework, the Tachyon in-memory filesystem, and the Mesos cluster manager. Spark was designed as an in-memory replacement for Hadoop that can in some cases improve performance by up to 100X. In this paper, we describe our experiences running BDAS on the new Cray Urika-XA...
Microsoft Word - EvaluationOfJava_ieeeformat_2.docx
Abstract: In the last few years, Java has gained popularity in processing “big data”, mostly with the Apache big data stack, a collection of open source frameworks dealing with abundant data, which includes several popular systems such as Hadoop, Hadoop Distributed File System (HDFS), and Spark. Efforts have been made to introduce Java to High Performance Computing (HPC) as well in the past, but were no...
Seismic Imaging with the Generalized Radon Transform: a Curvelet Transform Perspective
A key challenge in the seismic imaging of reflectors using surface reflection data is the subsurface illumination produced by a given data set and for a given complexity of the background model (of wavespeeds). The imaging is described here by the generalized Radon transform. To address the illumination challenge and enable (accurate) local parameter estimation, we develop a method for partial ...
DiGeST: Distributed Computing for Scalable Gene and Variant Ranking with Hadoop/Spark
Background: The advent of next-generation sequencing technologies has opened new avenues for clinical genomics research. In particular, as sequencing costs continue to decrease, an ever-growing number of clinical genomics institutes now rely on DNA sequencing studies at varying scales (genome, exome, mendeliome) for uncovering disease-associated variants or genes, in both rare and non-rare diseas...
Performance Benefits of DataMPI: A Case Study with BigDataBench
Apache Hadoop and Spark are gaining prominence in Big Data processing and analytics. Both are widely deployed at Internet companies. On the other hand, high-performance data analysis requirements are causing academic and industrial communities to adopt state-of-the-art technologies in HPC to solve Big Data problems. Recently, we have proposed a key-value pair based communication libra...